A Partitioning Framework for Aggressive Data Skipping

Authors

  • Liwen Sun
  • Sanjay Krishnan
  • Reynold Xin
  • Michael J. Franklin
Abstract

We propose to demonstrate a fine-grained partitioning framework that reorganizes data tuples into small blocks at data loading time. The goal is to let queries skip as many data blocks as possible. The partitioning framework consists of four steps: (1) workload analysis, which extracts features from a query workload, (2) augmentation, which augments each data tuple with a feature vector, (3) reduce, which succinctly represents a set of data tuples using a set of feature vectors, and (4) partitioning, which runs a clustering algorithm over the feature vectors and uses the clustering result to guide the actual data partitioning. Our experiments show that our techniques result in a 37x query response time improvement over traditional range partitioning due to more effective data skipping.
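The four steps in the abstract can be sketched on toy data as follows. This is a minimal illustration, not the authors' implementation: the rows, the three predicate features, the helper names (`augment`, `reduce_vectors`, `partition`, `build_blocks`, `scan`), and in particular the greedy merge heuristic used for clustering are all assumptions made for the example.

```python
from collections import Counter

# Hypothetical toy table: (id, price, region).
ROWS = [
    (1, 5, "EU"), (2, 50, "EU"), (3, 7, "US"), (4, 80, "US"),
    (5, 3, "EU"), (6, 90, "ASIA"), (7, 60, "US"), (8, 4, "ASIA"),
]

# Step 1, workload analysis (assumed result): predicates that recur in queries.
FEATURES = [
    lambda r: r[1] < 10,      # price < 10
    lambda r: r[2] == "EU",   # region = 'EU'
    lambda r: r[2] == "US",   # region = 'US'
]

def augment(row):
    """Step 2: bit i is 1 when the row satisfies feature i."""
    return tuple(int(f(row)) for f in FEATURES)

def reduce_vectors(rows):
    """Step 3: collapse rows into distinct feature vectors with counts."""
    return Counter(augment(r) for r in rows)

def union(u, v):
    return tuple(a | b for a, b in zip(u, v))

def partition(vector_counts, min_block_size):
    """Step 4 (assumed heuristic): repeatedly merge the smallest cluster
    into the neighbour that adds the fewest unskippable (tuple, feature) pairs."""
    # Each cluster is [union_vector, tuple_count, member_vectors].
    clusters = [[v, n, [v]] for v, n in sorted(vector_counts.items())]
    while len(clusters) > 1 and min(c[1] for c in clusters) < min_block_size:
        clusters.sort(key=lambda c: (c[1], c[0]))
        small = clusters.pop(0)

        def cost(c):
            merged = union(small[0], c[0])
            return (sum(merged) * (small[1] + c[1])
                    - sum(small[0]) * small[1] - sum(c[0]) * c[1])

        best = min(clusters, key=lambda c: (cost(c), c[0]))
        best[0] = union(small[0], best[0])
        best[1] += small[1]
        best[2] += small[2]
    return clusters

def build_blocks(rows, min_block_size=2):
    """Use the clustering of feature vectors to place the actual rows."""
    clusters = partition(reduce_vectors(rows), min_block_size)
    block_of = {v: i for i, c in enumerate(clusters) for v in c[2]}
    blocks = [{"features": c[0], "rows": []} for c in clusters]
    for r in rows:
        blocks[block_of[augment(r)]]["rows"].append(r)
    return blocks

def scan(blocks, feature_idx):
    """Answer a query matching one feature, skipping blocks whose bit is 0."""
    touched = [b for b in blocks if b["features"][feature_idx]]
    hits = [r for b in touched for r in b["rows"] if FEATURES[feature_idx](r)]
    return hits, len(touched)

if __name__ == "__main__":
    blocks = build_blocks(ROWS)
    for i in range(len(FEATURES)):
        hits, touched = scan(blocks, i)
        print(f"feature {i}: read {touched}/{len(blocks)} blocks, {len(hits)} hits")
```

Each block stores the bitwise OR of its rows' feature vectors, so a 0 bit guarantees that no row in the block satisfies that feature and the block can be skipped safely. On these toy rows, the `region = 'EU'` query reads a single block out of three.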

Similar resources

Skipping-oriented Partitioning for Columnar Layouts

As data volumes continue to grow, modern database systems increasingly rely on data skipping mechanisms to improve performance by avoiding access to irrelevant data. Recent work [39] proposed a fine-grained partitioning scheme that was shown to improve the opportunities for data skipping in row-oriented systems. Modern analytics and big data systems increasingly adopt columnar storage schemes, ...

Design and Evaluation of a Method for Partitioning and Offloading Web-based Applications in Mobile Systems with Bandwidth Constraints

Computation offloading is known to be among the effective solutions for running heavy applications on smart mobile devices. However, irregular changes in the mobile data rate directly affect code partitioning while offloading is in progress. It is believed that once a rate-adaptive partitioning is performed, the replication of such substantial processes due to bandwidth fluctuation can be avoid...

An Investigation in Mathematical Performance of Students Who Do Grade-skipping.

The main purpose of this study was to compare the performance of grade-skipped students with their peers in mathematical reasoning and applying. In this study, gender and mathematical self-concept were considered as effective variables. This study was a part of a longitudinal study.  The data analysis was performed through repeated measurements and the results showed that in applying math conce...

Haplotype Block Partitioning and tagSNP Selection under the Perfect Phylogeny Model

Single Nucleotide Polymorphisms (SNPs) are the most usual form of polymorphism in human genome. Analyses of genetic variations have revealed that individual genomes share common SNP-haplotypes. The particular pattern of these common variations forms a block-like structure on human genome. In this work, we develop a new method based on the Perfect Phylogeny Model to identify haplo...

Comparing the Effects of Eight Weeks of Whole Body Vibration Exercise Combined With Rope Skipping at two Different Intensities on Physical Performance of Older Men: A Randomized Single-Blind Clinical Trial

Objectives: Whole-Body Vibration (WBV) exercise seems to be an effective alternative to improve physical performance in the elderly. This study aims to compare the effects of eight weeks of WBV exercise combined with rope skipping at two different intensities on physical performance of older men. Methods & Materials: This is a randomized single-blind clinical trial. Participants were 30 older ...

Journal title:
  • PVLDB

Volume 7  Issue 

Pages  -

Publication date: 2014